Abstract. We study extremely diluted spin models of neural networks in which the 
connectivity evolves slowly in time, according to stochastic equations which on average aim to reduce frustration. The 
(fast) neurons and (slow) connectivity variables equilibrate separately, but at different 
temperatures. Our model is exactly solvable in equilibrium. We obtain phase diagrams 
upon making the condensed ansatz (i.e. recall of one pattern). These show that, as the 
connectivity temperature is lowered, the volume of the retrieval phase diverges and the 
fraction of mis-aligned spins is reduced. Still one always retains a region in the retrieval 
phase where recall states other than the one corresponding to the 'condensed' pattern 
are locally stable, so the associative memory character of our model is preserved. 

1. Introduction 

Most statistical mechanical studies of recurrent neural networks have traditionally been 
concerned with systems in which the dynamical variables are either the neurons (see 
e.g. P3 121 El E] or the reviews [Hj and references therein), or their interactions (or 
synapses, see e.g. 13121101110] or the reviews [12111011111 and references therein). The 
first type of processes describe network operation, whereas the second correspond to 
learning. These areas have by now been investigated quite extensively. In contrast, 
only a modest number of studies involved coupled dynamical laws for both neurons and 
interactions [101 HOI HH UHl HOI 1201 12ll I22j, to reflect the complex dynamical interplay 
between synapses and neurons found in the real brain. The approach usually adopted 
in the latter studies, to obtain analytically solvable models, is the introduction of a 
hierarchy of adiabatically separated time-scales, such that the fast variables (taken to 
be the neurons) are in equilibrium on the time-scales where the slow variables (the 
interactions, taken to be symmetric) evolve. One can also introduce further levels in 
the hierarchy by introducing different classes of interactions, each evolving on different 
characteristic time-scales [22]. The resulting formalism involves nested replica theories, 
with Parisi matrices j2S] in which the number of blocks at each level is the ratio of 
temperatures of subsequent levels in the hierarchy of equilibrating degrees of freedom. 
Such models can also serve to derive Parisi's replica symmetry breaking scheme [24]. 
In neural network studies the dynamics of the interactions have usually been governed 
by Langevin equations in which the deterministic forces are proportional to expectation 
values of neuronal pair correlations (with the neuron state statistics corresponding to 
Boltzmann equilibrium, given the instantaneous values of the interactions), potentially 
biased to reflect the possibility of recall of a pattern. In jTHl the interactions were 
taken to evolve away from an initial state given by Hopfield's PP interaction matrix, 
with an extensive number of stored patterns. There it was found that for low interaction 
temperatures the network collapsed into an undesirable so-called 'super-ferromagnetic' 
state, whereas for negative replica dimension (corresponding to anti-Hebbian learning) 
the storage capacity of the network was found to be enhanced. 

All papers dealing with the theory of coupled neuronal and interaction dynamics 
published so far assumed full connectivity: each neuron interacted with each other 
neuron, with the magnitude and sign of the interactions evolving in time. Here we 
propose and study a model of a symmetrically diluted recurrent neural network in 
which the geometry (or connectivity) is allowed to change slowly. On time-scales where 
the neuron variables are in thermodynamic equilibrium, the microscopic realization 
of the (discrete) dilution variables (reflecting the connectivity) is allowed to evolve 
slowly and stochastically, driven by forces aiming at a reduction of global frustration, 
without however changing the actual values of the bonds (the latter are frozen, given by 
Hopfield's PP recipe). It has been known that one may store information in recurrent 
neural networks solely by eliminating frustrated bonds, but this has always been done 
by hand (see e.g. and references therein). Here the system is allowed to adapt 
its geometry autonomously. It should be emphasized that there is an important 
difference between having dynamic bonds with Hebbian type dynamical laws, as in 
P3l Uni im 1201 1211 1221 , and the present situation of having dynamic geometry with fixed 
Hebbian values for active bonds. The former definitions imply irreversible modification 
or even elimination of stored information, whereas in the present paper, since the values 
assigned to the active bonds are not modified, the slow adaptation is fully reversible 
(one can always return to random dilution) and all stored information is retained. 

The scaling with the system size N chosen for the average connectivity c in the 
system (the average number of bonds per spin) will have a strong influence on the 
structure of the resulting theory. In this first paper we consider the so-called 'extreme 
dilution' regime [20], defined by $\lim_{N\to\infty} c^{-1} = \lim_{N\to\infty} c/N = 0$ (a second paper will 
be devoted to the finite connectivity regime, where $c = \mathcal{O}(N^0)$ as $N\to\infty$). We 
solve our coupled spin and geometry dynamics model analytically, in replica symmetric 
ansatz. We find that, as a result of the connectivity adaptation, the network geometry 
becomes more ordered to boost retrieval of condensed patterns, as a result of which 
the system's retrieval phase is enhanced compared to that of the corresponding network 
with a quenched random connectivity matrix as studied in [20], and that the fraction of 
'misaligned spins' is reduced as the temperature of the connectivity variables is lowered. 
Yet one still retains regions in the phase diagram where the alternative (presently non- 
condensed) pure retrieval states remain locally stable, so that the system continues to 
function as an associative memory. 



2. Definitions 



We study diluted Hopfield [1] type recurrent neural networks, with (fast) binary neurons 
$\sigma_i \in \{-1,1\}$ (denoting quiet versus firing states) and $i = 1,\ldots,N$. The geometry of the 
system is defined by connectivity variables $c_{ij} \in \{0,1\}$, with $c_{ji} = c_{ij}$ and $c_{ii} = 0$. Our 
neurons evolve according to Glauber-type local field alignment at temperature $T = \beta^{-1}$, 
with the fields defined by $h_i = \sum_j \frac{c_{ij}}{c}\sum_{\mu=1}^{p}\xi_i^\mu\xi_j^\mu\sigma_j$, i.e. with Hebbian interactions 
whenever $c_{ij} = 1$ (when a bond $(i,j)$ is present). For frozen geometry $\{c_{ij}\}$ our Ising 
spin neurons would equilibrate to a Boltzmann state characterized by the Hamiltonian 

$$H_f(\sigma,c) = -\sum_{i<j}\frac{c_{ij}}{c}\sum_{\mu=1}^{p}\xi_i^\mu\xi_j^\mu\,\sigma_i\sigma_j \qquad (1)$$

Here $\sigma = (\sigma_1,\ldots,\sigma_N)$ and $c = \{c_{ij}\}$. The $\{\xi_i^\mu\} \in \{-1,1\}$ with $\mu = 1,\ldots,p$ represent p 
fixed patterns $\xi^\mu = (\xi_1^\mu,\ldots,\xi_N^\mu)$ to be stored and hopefully recalled. Instead of frozen, 
we now take our geometry to also evolve in time, albeit on time-scales much larger than 
those of the neuronal relaxation (so the neurons can always be assumed in equilibrium, 
given the instantaneous geometry). This slow process is again taken to be of a Glauber 
type, but at temperature $\tilde T = \tilde\beta^{-1}$ and with the connectivity Hamiltonian 

$$H_s(c) = -\frac{1}{\beta}\log Z_f(c) + \frac{1}{\tilde\beta}\log\Big(\frac{N}{c}\Big)\sum_{i<j}c_{ij} \qquad (2)$$
$$Z_f(c) = \sum_{\sigma} e^{-\beta H_f(\sigma,c)} \qquad (3)$$

The second term in (2) acts as a chemical potential, ensuring an average number of c 
connections per neuron. The pre-factor $1/\tilde\beta$ will be found helpful later. 
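
To make the definitions above concrete, the following Python sketch builds a randomly diluted symmetric geometry with Hebbian bond values, and evaluates the local fields and the fast Hamiltonian for a given spin state. It assumes the normalisation $h_i = \sum_j (c_{ij}/c)\sum_\mu \xi_i^\mu\xi_j^\mu\sigma_j$ for the fields; the function names, the random initial geometry and the parameter values are illustrative choices and are not taken from the paper.

```python
import random

def make_network(N, c, p, seed=0):
    """Random patterns xi[mu][i] in {-1, 1} and a symmetric random geometry
    cij[i][j] in {0, 1} with on average c bonds per neuron (cij[i][i] = 0)."""
    rng = random.Random(seed)
    xi = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(p)]
    cij = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < c / N:
                cij[i][j] = cij[j][i] = 1
    return xi, cij

def bond_value(xi, i, j, c):
    """Hebbian value (1/c) * sum_mu xi_i^mu xi_j^mu of the bond (i, j)."""
    return sum(pattern[i] * pattern[j] for pattern in xi) / c

def local_fields(sigma, xi, cij, c):
    """Local fields h_i, summing only over the bonds that are present."""
    N = len(sigma)
    return [sum(bond_value(xi, i, j, c) * sigma[j]
                for j in range(N) if cij[i][j]) for i in range(N)]

def fast_energy(sigma, xi, cij, c):
    """Fast Hamiltonian: minus the sum over present bonds i<j of J_ij s_i s_j."""
    N = len(sigma)
    return -sum(bond_value(xi, i, j, c) * sigma[i] * sigma[j]
                for i in range(N) for j in range(i + 1, N) if cij[i][j])
```

With these conventions the energy equals $-\frac12\sum_i\sigma_i h_i$, which gives a quick consistency check on any implementation of the fields.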

The properties of our system at the largest time-scales, where also the geometry 
has equilibrated, are characterized by the partition sum of the slow variables: 

$$Z_s = \sum_{c} e^{-\tilde\beta H_s(c)} = \sum_{c}\,[Z_f(c)]^{\tilde\beta/\beta}\; e^{-\log(N/c)\sum_{i<j}c_{ij}} \qquad (4)$$

This sum is interpreted as describing $n = \tilde\beta/\beta$ replicated copies of the fast system, 
leading to a replica theory with finite replica dimension n. Minimization of $H_s(c)$ 
should give a 'smart' arrangement of the geometry $\{c_{ij}\}$, tailored to the realization of 
the patterns, but constrained to give an average connectivity c. In the remainder of this 
paper we calculate phase diagrams and the fraction of mis-aligned spins. Phases are 
characterized by the values of the replicated overlap and spin glass order parameters 

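$$m_{\alpha\mu} = \lim_{N\to\infty} N^{-1}\sum_i \xi_i^\mu\langle\sigma_i^\alpha\rangle, \qquad q_{\alpha\beta} = \lim_{N\to\infty} N^{-1}\sum_i \langle\sigma_i^\alpha\sigma_i^\beta\rangle \qquad (5)$$

Here $\langle\ldots\rangle$ denotes averaging over all geometries $\{c_{ij}\}$ and all spin-configurations $\sigma^\alpha$ in 
each of the replicas $\alpha = 1,\ldots,n$, with the Boltzmann measure associated with (4): 

$$P(c,\{\sigma^\alpha\}) \propto \exp\Big\{\frac{\beta}{c}\sum_{i<j}c_{ij}\sum_{\alpha=1}^{n}\sigma_i^\alpha\sigma_j^\alpha\sum_{\mu=1}^{p}\xi_i^\mu\xi_j^\mu - \log\Big(\frac{N}{c}\Big)\sum_{i<j}c_{ij}\Big\} \qquad (6)$$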



3. Equilibrium analysis 

3.1. Calculation of the RS free energy 

The thermodynamic properties of the stationary state, with equilibrated geometry, are 
derived from the asymptotic free energy per spin $f = -\lim_{N\to\infty}(\tilde\beta N)^{-1}\log Z_s$. Upon 
performing the trace over all geometries in (4) one obtains, with $\sigma_i = (\sigma_i^1,\ldots,\sigma_i^n)$: 

$$Z_s = \sum_{\sigma^1\ldots\sigma^n}\prod_{i<j}\Big[1 + \frac{c}{N}\,e^{\frac{\beta}{c}(\sigma_i\cdot\sigma_j)\sum_{\mu=1}^{p}\xi_i^\mu\xi_j^\mu}\Big] \qquad (7)$$

In evaluating the free energy we make the usual 'condensed' ansatz: only a finite number 
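r of patterns will be structurally correlated with the system state. The remaining $\alpha c - r$ 
patterns can be treated as frozen disorder, over which the free energy may be averaged. 
For the result we write $[f]_{\rm dis}$. In this paper we work within the connectivity scaling 
regime of extreme dilution, where $\lim_{N\to\infty} c/N = \lim_{N\to\infty} c^{-1} = 0$. Now one obtains 

$$-\tilde\beta\,[f]_{\rm dis} = \lim_{N\to\infty}\frac{1}{N}\log\sum_{\sigma^1\ldots\sigma^n}\exp\Big\{\frac{\beta}{N}\sum_{i<j}(\sigma_i\cdot\sigma_j)\sum_{\mu\le r}\xi_i^\mu\xi_j^\mu + \frac{\alpha\beta^2}{2N}\sum_{i<j}(\sigma_i\cdot\sigma_j)^2\Big\} \qquad (8)$$

(modulo irrelevant additive constants). We define the familiar pattern and state 
overlaps $m_{\alpha\mu}(\sigma) = N^{-1}\sum_i \xi_i^\mu\sigma_i^\alpha$ and $q_{\alpha\beta}(\{\sigma\}) = N^{-1}\sum_i \sigma_i^\alpha\sigma_i^\beta$. They are introduced 
via appropriate $\delta$-distributions, so that the spin traces can be carried out. This results 
in the usual type of steepest descent expression for $[f]_{\rm dis}$ (again modulo constants): 

$$[f]_{\rm dis} = {\rm extr}_{\{m_{\alpha\mu},\,q_{\alpha\beta}\}}\Big\{\frac{1}{2n}\sum_{\alpha}\sum_{\mu\le r} m_{\alpha\mu}^2 + \frac{\alpha\beta}{4n}\sum_{\alpha\beta} q_{\alpha\beta}^2 \qquad (9)$$
$$\qquad\qquad - \frac{1}{\beta n}\Big\langle\log\sum_{\sigma_1\ldots\sigma_n} e^{\beta\sum_{\alpha}\sum_{\mu\le r} m_{\alpha\mu}\xi_\mu\sigma_\alpha + \frac12\alpha\beta^2\sum_{\alpha\beta} q_{\alpha\beta}\sigma_\alpha\sigma_\beta}\Big\rangle_{\xi}\Big\} \qquad (10)$$

where $\langle g(\xi)\rangle_\xi = 2^{-r}\sum_{\xi\in\{-1,1\}^r} g(\xi)$. The parameter c represents the ensemble 
averaged connectivity. This follows upon adding suitable generating fields to the slow 
Hamiltonian: $H_s(c) \to H_s(c) + \lambda\sum_{i<j}c_{ij}$. Repeating the above calculation with the 
added fields shows that, for $\lambda\to 0$, $\lim_{N\to\infty}\frac{2}{cN}\sum_{i<j}\langle c_{ij}\rangle = 1$, which proves our claim. 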

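We next make the replica-symmetric (RS) ansatz for the physical saddle-point: 
$m_{\alpha\mu} = m_\mu$ for all $(\alpha,\mu)$ and $q_{\alpha\beta} = q + \delta_{\alpha\beta}(1-q)$ for all $(\alpha,\beta)$, keeping in mind that the 
replica dimension n can take any non-negative value: 

$$[f]^{\rm RS}_{\rm dis} = {\rm extr}_{\{m_\mu,\,q\}}\Big\{\frac12\sum_{\mu\le r} m_\mu^2 + \frac14\alpha\beta\,[2q + (n-1)q^2] - \frac{1}{\beta n}\Big\langle\log\int\! Dz\,\cosh^n\Big[\beta\Big(\sum_{\mu\le r} m_\mu\xi_\mu + z\sqrt{\alpha q}\Big)\Big]\Big\rangle_\xi\Big\} \qquad (11)$$

with the Gaussian measure $Dz = (2\pi)^{-1/2}e^{-z^2/2}dz$. Variation of $\{m_\mu, q\}$ gives the saddle-point equations for our RS order parameters, with 
the short-hand $\Xi = \beta(\sum_{\mu\le r} m_\mu\xi_\mu + z\sqrt{\alpha q})$, which are of the familiar form 

$$m_\mu = \Big\langle \xi_\mu\,\frac{\int\! Dz\,\tanh(\Xi)\cosh^n(\Xi)}{\int\! Dz\,\cosh^n(\Xi)}\Big\rangle_\xi \qquad (12)$$
$$q = \Big\langle \frac{\int\! Dz\,\tanh^2(\Xi)\cosh^n(\Xi)}{\int\! Dz\,\cosh^n(\Xi)}\Big\rangle_\xi \qquad (13)$$

The physical meaning of the RS order parameters is $m_\mu = \lim_{N\to\infty} N^{-1}\sum_i \xi_i^\mu\langle\sigma_i\rangle$ and 
$q = \lim_{N\to\infty} N^{-1}\sum_i \langle\sigma_i\rangle^2$, as usual. 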

3.2. Phase transitions and phase diagrams 

If one simplifies matters further by assuming only one pattern to be condensed, i.e. 
$m_\mu = m\,\delta_{\mu 1}$, then equations (12,13) reduce to 

$$m = \frac{\int\! Dz\,\tanh[\beta(m + z\sqrt{\alpha q})]\cosh^n[\beta(m + z\sqrt{\alpha q})]}{\int\! Dz\,\cosh^n[\beta(m + z\sqrt{\alpha q})]} \qquad (14)$$
$$q = \frac{\int\! Dz\,\tanh^2[\beta(m + z\sqrt{\alpha q})]\cosh^n[\beta(m + z\sqrt{\alpha q})]}{\int\! Dz\,\cosh^n[\beta(m + z\sqrt{\alpha q})]} \qquad (15)$$

These are recognized to be identical to those of the finite n model studied in [27], provided we 
re-define the parameters in the latter according to 

J^^^mlk m J^'^'^q/k apq (16) 

This makes sense, since the $n\to 0$ limit of our present model (i.e. the symmetrically 
extremely diluted Hopfield model with quenched random connectivity [20]) is known 
to map onto the $n\to 0$ limit of [27] (i.e. the SK model [28]). Clearly one finds 
simplified equations for the special dimension values n = 1 (equivalent to having 
annealed geometry) and n = 2, where the Gaussian integrals can be done. For instance, 
at n = 1 the equation for m reduces to $m = \tanh(\beta m)$, whereas for n = 2 one finds 

$$m = \frac{\sinh(2\beta m)}{\cosh(2\beta m) + e^{-2\alpha\beta^2 q}}, \qquad q = \frac{\cosh(2\beta m) - e^{-2\alpha\beta^2 q}}{\cosh(2\beta m) + e^{-2\alpha\beta^2 q}} \qquad (17)$$
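
The special cases n = 1 and n = 2 offer a convenient numerical check on any implementation of the single-pattern RS equations. The sketch below evaluates the right-hand sides $\int Dz\,\tanh^k[\beta(m+z\sqrt{\alpha q})]\cosh^n[\ldots]/\int Dz\,\cosh^n[\ldots]$ (k = 1, 2) by a plain trapezoid rule, and compares them with the closed forms $\tanh(\beta m)$ at n = 1 and $\sinh(2\beta m)/(\cosh(2\beta m)+e^{-2\alpha\beta^2 q})$, $(\cosh(2\beta m)-e^{-2\alpha\beta^2 q})/(\cosh(2\beta m)+e^{-2\alpha\beta^2 q})$ at n = 2. The helper names, the integration cut-off and the grid size are assumptions made for illustration only.

```python
import math

def gauss_avg(f, zmax=8.0, steps=800):
    """Trapezoid estimate of the Gaussian average  int Dz f(z)."""
    dz = 2.0 * zmax / steps
    total = 0.0
    for k in range(steps + 1):
        z = -zmax + k * dz
        w = 0.5 if k in (0, steps) else 1.0
        total += w * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def rs_rhs(m, q, beta, alpha, n):
    """Right-hand sides of the single-pattern RS equations for (m, q)."""
    s = math.sqrt(alpha * q)
    arg = lambda z: beta * (m + s * z)
    den = gauss_avg(lambda z: math.cosh(arg(z)) ** n)
    new_m = gauss_avg(lambda z: math.tanh(arg(z)) * math.cosh(arg(z)) ** n) / den
    new_q = gauss_avg(lambda z: math.tanh(arg(z)) ** 2 * math.cosh(arg(z)) ** n) / den
    return new_m, new_q

def rs_rhs_n2(m, q, beta, alpha):
    """Closed form of the same right-hand sides at n = 2."""
    e = math.exp(-2.0 * alpha * beta ** 2 * q)
    ch = math.cosh(2.0 * beta * m)
    return math.sinh(2.0 * beta * m) / (ch + e), (ch - e) / (ch + e)
```

Agreement between `rs_rhs(m, q, beta, alpha, 2)` and `rs_rhs_n2(m, q, beta, alpha)` at arbitrary test values of (m, q) is a useful sanity check before solving the equations for general n.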

Our RS equations admit three phases: a paramagnetic phase (P) with m = q = 0, a 
recall phase (R) where $m \neq 0$ and q > 0, and a spin-glass phase (SG) where m = 0 
but q > 0. Since deriving the RS phase transitions has been reduced to appropriate 
translation of the results found in [27], we will here simply mention the outcome: 

• For sufficiently small $\alpha$ one will find a P→R transition at a finite temperature. For 
$\alpha < \alpha_c = 1/(3n-2)$ this transition is second order, and occurs at $T_R = 1$. 

Figure 1. RS phase diagram in the space of control parameters. We show the critical 
temperature(s) as surface(s) over the $(\alpha, n)$ plane. The high temperature phase is 
paramagnetic (P). At low temperature we find the retrieval phase R. For $(\alpha, n)$ values 
with two critical temperatures, the latter define the boundaries of a spin-glass phase 
SG. The P→R transitions are second order for $\alpha < 1/(3n-2)$, and first order elsewhere. 
The P→SG transitions are second order for n < 2 and first order elsewhere. For large 
$\alpha$ the SG→R transitions become second order, but for small $\alpha$ they are first order. 



• For larger $\alpha$, lowering temperature will lead first to a P→SG transition, followed 
at some yet lower temperature by a SG→R transition‡. For n < 2 the P→SG 
transition is second order, and occurs for $T_{SG} = \sqrt{\alpha}$. 

• The SG→R transition is second order for $\alpha\to\infty$, where its transition temperature 
tends to $T_c = n$, but first order for sufficiently small $\alpha$. 

• The effects of increasing the replica dimension n are (i) a reduction of the size in 
the phase diagram of the SG phase, and (ii) a change of the orders of the P→R 
and P→SG transitions, from second order (for small n) to first order (for large n). 

Numerical solution of equations (14,15) leads to the RS phase diagram drawn in figure 1. 
Figure 2 shows intersections of this diagram in the planes of constant replica dimension 
n = 0.1 and n = 2. All transitions discussed and drawn above refer to bifurcations 
of locally stable solutions, since for recurrent neural networks the time-scales where 
thermodynamic stability would become an issue are in practice never reached. 
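
A minimal way to obtain such solutions numerically is plain fixed-point iteration of the single-pattern RS equations, started from a strongly retrieving state so that one follows the locally stable recall branch where it exists. The sketch below assumes the equations in the form $m = \int Dz\,\tanh(\Xi)\cosh^n(\Xi)/\int Dz\,\cosh^n(\Xi)$ and $q = \int Dz\,\tanh^2(\Xi)\cosh^n(\Xi)/\int Dz\,\cosh^n(\Xi)$ with $\Xi = \beta(m+z\sqrt{\alpha q})$; the function names, starting point, tolerance and quadrature are illustrative assumptions, and this is not necessarily the numerical method used for the figures.

```python
import math

def gauss_avg(f, zmax=8.0, steps=800):
    """Trapezoid estimate of the Gaussian average  int Dz f(z)."""
    dz = 2.0 * zmax / steps
    total = 0.0
    for k in range(steps + 1):
        z = -zmax + k * dz
        w = 0.5 if k in (0, steps) else 1.0
        total += w * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def solve_rs(beta, alpha, n, m=1.0, q=1.0, max_iter=500, tol=1e-10):
    """Iterate the RS equations from (m, q) until the update is below tol."""
    for _ in range(max_iter):
        s = math.sqrt(alpha * q)
        xi_of = lambda z: beta * (m + s * z)
        den = gauss_avg(lambda z: math.cosh(xi_of(z)) ** n)
        new_m = gauss_avg(lambda z: math.tanh(xi_of(z)) * math.cosh(xi_of(z)) ** n) / den
        new_q = gauss_avg(lambda z: math.tanh(xi_of(z)) ** 2 * math.cosh(xi_of(z)) ** n) / den
        converged = abs(new_m - m) + abs(new_q - q) < tol
        m, q = new_m, new_q
        if converged:
            break
    return m, q
```

Scanning the temperature at fixed $(\alpha, n)$ and recording where the solution with m > 0 (or with q > 0) first appears or disappears gives the bifurcation lines of locally stable solutions; near first order transitions the outcome depends on the starting point, so both a retrieval and a paramagnetic initialisation should be tried.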

We finally turn to replica symmetry breaking (RSB). The location in our phase 
diagram of second order RSB phase transitions follows upon inspection of the eigenvalues 
of the Hessian. Since our model can be mapped onto the nonzero-n SK model [27], 
we can read off the eigenvalues from [29]. The dangerous eigenvalue $\lambda_{\rm RSB}$ is the one 
associated with the so-called replicon mode: 

$$\lambda_{\rm RSB} = \alpha\beta^2\Big[1 - \alpha\beta^2\,[1 - 2q + h(m,q)]\Big] \qquad (18)$$
$$h(m,q) = \frac{\int\! Dz\,\tanh^4[\beta(m + z\sqrt{\alpha q})]\cosh^n[\beta(m + z\sqrt{\alpha q})]}{\int\! Dz\,\cosh^n[\beta(m + z\sqrt{\alpha q})]} \qquad (19)$$

Replica symmetry is stable only if $\lambda_{\rm RSB} > 0$. For each combination $(\alpha, T)$ one finds a 
critical value $n_c(\alpha, T)$ (the AT line) below which replica symmetry is unstable. Examples 
of the behaviour of $n_c(\alpha, T)$ are shown in figure 3. Replica symmetry breaking is 
found to occur only for n < 0.32. 

‡ For small values of n, the latter SG→R transition is expected to disappear when replica symmetry 
breaking is taken into account. 

Figure 2. Intersections of the phase diagram shown in figure 1 at n = 0.1 (left) 
and n = 2 (right). We have paramagnetic (P), recall (R) and spin-glass (SG) phases. 
We note that there is no critical value for $\alpha$ above which recall is no longer possible. 
Instead the SG→R transition line will approach the line T = n for large $\alpha$. Since RSB 
phenomena appear to be confined to n < 1 (see below), this is not an artifact of the 
RS assumption. For large n all phase transitions ultimately become first order. 

Figure 3. Location of the AT instability $n_c$, shown as a function of the inverse 
temperature $\beta = T^{-1}$, and for a number of different storage ratios $\alpha$. Replica symmetry 
breaking is seen to be limited to values of n below 0.32. We also note the non-monotonic 
dependence on temperature of the critical dimension $n_c$ for fixed $\alpha$. 
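
The stability test can be evaluated numerically once (m, q) are known. The sketch below assumes the replicon eigenvalue in the form $\lambda_{\rm RSB} = \alpha\beta^2[1 - \alpha\beta^2(1 - 2q + h)]$, with h the $\cosh^n$-weighted Gaussian average of $\tanh^4[\beta(m+z\sqrt{\alpha q})]$; this form, the function names and the quadrature are assumptions made for illustration, and the values of (m, q) must be supplied from a separate solution of the RS saddle-point equations.

```python
import math

def gauss_avg(f, zmax=8.0, steps=800):
    """Trapezoid estimate of the Gaussian average  int Dz f(z)."""
    dz = 2.0 * zmax / steps
    total = 0.0
    for k in range(steps + 1):
        z = -zmax + k * dz
        w = 0.5 if k in (0, steps) else 1.0
        total += w * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def h_moment(m, q, beta, alpha, n):
    """cosh^n-weighted average of tanh^4 of the effective field."""
    s = math.sqrt(alpha * q)
    arg = lambda z: beta * (m + s * z)
    num = gauss_avg(lambda z: math.tanh(arg(z)) ** 4 * math.cosh(arg(z)) ** n)
    den = gauss_avg(lambda z: math.cosh(arg(z)) ** n)
    return num / den

def replicon_eigenvalue(m, q, beta, alpha, n):
    """Assumed form of lambda_RSB; RS is taken to be stable when this is > 0."""
    h = h_moment(m, q, beta, alpha, n)
    return alpha * beta ** 2 * (1.0 - alpha * beta ** 2 * (1.0 - 2.0 * q + h))
```

For fixed $(\alpha, T)$, a critical dimension $n_c$ could then be located by bisection in n on the sign of `replicon_eigenvalue`, re-solving for (m, q) at each trial value of n.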

Compared to diluted neural network models with static random geometry, the main 
effect of introducing dynamic geometry (with the present Glauber dynamics, aimed at 
reducing frustration) is to reduce the spin-glass phase in favour of the recall phase. 
The geometry adjusts itself autonomously in order to retrieve the condensed pattern 
optimally, to such an extent that for sufficiently low temperature there is no upper limit 
on the storage ratio (provided we do not leave the 'extreme dilution' scaling regime 
$\lim_{N\to\infty} c/N = \lim_{N\to\infty} c^{-1} = 0$). This then raises the question of whether the other 
(non-condensed) patterns can be retrieved at all after the geometry has been tailored 
to the recall of one specific condensed pattern. This is investigated in section 5. 

4. Fraction of misaligned spins 

We expect the observed improvement of retrieval performance due to the slow geometry 
dynamics to be reflected in a reduction with increasing n of the fraction of frustrated 
bonds in the system. To verify this we calculate a different but similar quantity: the 
fraction of misaligned spins, i.e. those where $\sigma_i$ and local field $h_i$ have opposite sign: 

$$\phi = \lim_{N\to\infty}\frac{1}{N}\sum_i \langle\theta[-\sigma_i h_i]\rangle \qquad (20)$$

To calculate this object one could introduce further replicas, but here we follow an 
alternative route: we solve our model first for finite c, in which case joint replicated 
spin-field distributions (in terms of which $\phi$ can be expressed) become the natural order 
parameters, followed by taking the limit $c\to\infty$. 

4.1. Calculation of the joint spin-field distribution 

To do so we have to adapt and generalize the calculation in e.g. [30] by first introducing 
the $2^p$ so-called sub-lattices [31], with $\xi_i = (\xi_i^1,\ldots,\xi_i^p)$: 

$$I_\xi = \{\,i \;|\; \xi_i = \xi\,\} \qquad (21)$$

as in e.g. [32]. In each sublattice we may define a joint distribution for replicated spins 
and fields, and (with a modest amount of foresight) conjugate fields: 

$$P_\xi(\sigma,h,\hat h) = \frac{1}{|I_\xi|}\sum_{i\in I_\xi}\delta_{\sigma,\sigma_i}\,\delta[h - h_i(\{\sigma\})]\,\delta[\hat h - \hat h_i] \qquad (22)$$

where $\sigma\in\{-1,1\}^n$, $h,\hat h\in\mathbb{R}^n$, and $h_i^\alpha(\{\sigma\}) = \sum_{j\neq i}\frac{c_{ij}}{c}(\xi_i\cdot\xi_j)\,\sigma_j^\alpha$. In evaluating 
the free energy per spin we write the fast Hamiltonian in terms of replicated fields, 
and introduce (22) by inserting suitable integrals over $\delta$-functions. This is done first 
only for discrete values of h, with the continuum limit (converting integrals into path 
integrals) to be taken after the thermodynamic limit. We abbreviate $\{dP\,d\hat P\} = 
\prod_{\xi,\sigma,h,\hat h}[dP_\xi(\sigma,h,\hat h)\,d\hat P_\xi(\sigma,h,\hat h)]$ and find, upon carrying out the sum over the geometry and evaluating the resulting integral by steepest descent: 

$$-\tilde\beta f = {\rm extr}_{\{P,\hat P\}}\Big\{\Big\langle\sum_{\sigma}\int\! dh\,d\hat h\; P_\xi(\sigma,h,\hat h)\big[\,i\hat P_\xi(\sigma,h,\hat h) + \tfrac12\beta(\sigma\cdot h)\big]\Big\rangle_\xi$$
$$\qquad + \tfrac12 c\,\Big\langle\!\Big\langle\sum_{\sigma\sigma'}\int\! dh\,dh'\,d\hat h\,d\hat h'\; P_\xi(\sigma,h,\hat h)\,P_{\xi'}(\sigma',h',\hat h')\; e^{-\frac{i}{c}(\xi\cdot\xi')[(\hat h\cdot\sigma') + (\hat h'\cdot\sigma)]}\Big\rangle\!\Big\rangle_{\xi\xi'}$$
$$\qquad + \Big\langle\log\sum_{\sigma\in\{-1,1\}^n}\int\!\frac{dh\,d\hat h}{(2\pi)^n}\; e^{\,i\hat h\cdot h - i\hat P_\xi(\sigma,h,\hat h)}\Big\rangle_\xi\Big\} \qquad (23)$$



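Extremization with respect to $P_\xi(\sigma,h,\hat h)$ and $\hat P_\xi(\sigma,h,\hat h)$ gives the following two 
saddle-point equations: 

$$\hat P_\xi(\sigma,h,\hat h) = ic\,\Big\langle\sum_{\sigma'}\int\! dh'\,d\hat h'\; P_{\xi'}(\sigma',h',\hat h')\; e^{-\frac{i}{c}(\xi\cdot\xi')[(\hat h\cdot\sigma') + (\hat h'\cdot\sigma)]}\Big\rangle_{\xi'} + \tfrac12 i\beta(\sigma\cdot h) \qquad (24)$$
$$P_\xi(\sigma,h,\hat h) = \frac{e^{\,i\hat h\cdot h - i\hat P_\xi(\sigma,h,\hat h)}}{\sum_{\sigma'}\int\! dh'\,d\hat h'\; e^{\,i\hat h'\cdot h' - i\hat P_\xi(\sigma',h',\hat h')}} \qquad (25)$$

Insertion of (24) into (25) gives a saddle-point equation in terms of P only: 

$$P_\xi(\sigma,h,\hat h) = Z_\xi^{-1}\exp\Big\{i\hat h\cdot h + \tfrac12\beta(\sigma\cdot h) + c\,\Big\langle\sum_{\sigma'}\int\! dh'\,d\hat h'\; P_{\xi'}(\sigma',h',\hat h')\; e^{-\frac{i}{c}(\xi\cdot\xi')[(\hat h\cdot\sigma') + (\hat h'\cdot\sigma)]}\Big\rangle_{\xi'}\Big\} \qquad (26)$$

with $Z_\xi$ denoting a normalization constant. According to (22), the physical meaning of 
the saddle-point is 

$$P_\xi(\sigma,h,\hat h) = \lim_{N\to\infty}\frac{1}{|I_\xi|}\sum_{i\in I_\xi}\Big\langle\delta_{\sigma,\sigma_i}\,\delta[h - h_i(\{\sigma\})]\,\delta[\hat h - \hat h_i]\Big\rangle \qquad (27)$$

We next make the one pattern condensed ansatz (this is not essential for being able 
to proceed, but will simplify and compactify our derivation significantly), which here 
implies $P_\xi(\sigma,h,\hat h) = P_{\xi_1}(\sigma,h,\hat h)$, and we send $c\to\infty$. As a result $(\xi\cdot\xi')/\sqrt{c} = 
\xi_1\xi_1'/\sqrt{c} + \sqrt{\alpha}\,z$ where z is a zero-average unit-variance Gaussian variable, and 

$$P_{\xi_1}(\sigma,h,\hat h) = Z_{\xi_1}^{-1}\exp\Big\{i\hat h\cdot h + \tfrac12\beta(\sigma\cdot h) - \Big\langle\sum_{\sigma'}\int\! dh'\,d\hat h'\; P_{\xi_1'}(\sigma',h',\hat h')\Big[\,i\xi_1\xi_1'\,[(\hat h\cdot\sigma') + (\hat h'\cdot\sigma)] + \tfrac{\alpha}{2}[(\hat h\cdot\sigma') + (\hat h'\cdot\sigma)]^2\Big]\Big\rangle_{\xi_1'}\Big\} \qquad (28)$$

In the right-hand side of (28) we are seen to need only the following moments of our 
distributions (which include the previously encountered $\{m_\alpha, q_{\alpha\beta}\}$): 

$$m_\alpha = \Big\langle\xi_1\sum_\sigma\int\! dh\,d\hat h\; P_{\xi_1}(\sigma,h,\hat h)\,\sigma_\alpha\Big\rangle_{\xi_1}, \qquad q_{\alpha\beta} = \Big\langle\sum_\sigma\int\! dh\,d\hat h\; P_{\xi_1}(\sigma,h,\hat h)\,\sigma_\alpha\sigma_\beta\Big\rangle_{\xi_1}$$
$$k_\alpha = i\Big\langle\xi_1\sum_\sigma\int\! dh\,d\hat h\; P_{\xi_1}(\sigma,h,\hat h)\,\hat h_\alpha\Big\rangle_{\xi_1}, \qquad K_{\alpha\beta} = i\Big\langle\sum_\sigma\int\! dh\,d\hat h\; P_{\xi_1}(\sigma,h,\hat h)\,\sigma_\alpha\hat h_\beta\Big\rangle_{\xi_1}$$
$$L_{\alpha\beta} = \Big\langle\sum_\sigma\int\! dh\,d\hat h\; P_{\xi_1}(\sigma,h,\hat h)\,\hat h_\alpha\hat h_\beta\Big\rangle_{\xi_1} \qquad (29)$$

Integration by parts over the fields in (28) shows that $k_\alpha = -\frac12\beta m_\alpha$, $K_{\alpha\beta} = -\frac12\beta q_{\alpha\beta}$, 
and $L_{\alpha\beta} = -\frac14\beta^2 q_{\alpha\beta}$. The replicated joint spin-field distributions can now be written as 

$$P_\xi(\sigma,h) = \frac{e^{\,\xi\beta\, m\cdot\sigma + \frac12\alpha\beta^2\,\sigma\cdot q\sigma - \frac{1}{2\alpha}(h - \xi m - \alpha\beta q\sigma)\cdot q^{-1}(h - \xi m - \alpha\beta q\sigma)}}{\sum_{\sigma'}\int\! dh'\; e^{\,\xi\beta\, m\cdot\sigma' + \frac12\alpha\beta^2\,\sigma'\cdot q\sigma' - \frac{1}{2\alpha}(h' - \xi m - \alpha\beta q\sigma')\cdot q^{-1}(h' - \xi m - \alpha\beta q\sigma')}} \qquad (30)$$

with $m = \{m_\alpha\}$ and $q = \{q_{\alpha\beta}\}$. The latter obey the following familiar closed equations 
which in RS ansatz lead one back to (14,15), as they should: 

$$m_\alpha = \frac{\sum_\sigma \sigma_\alpha\, e^{\,\beta m\cdot\sigma + \frac12\alpha\beta^2\sigma\cdot q\sigma}}{\sum_\sigma e^{\,\beta m\cdot\sigma + \frac12\alpha\beta^2\sigma\cdot q\sigma}}, \qquad q_{\alpha\beta} = \frac{\sum_\sigma \sigma_\alpha\sigma_\beta\, e^{\,\beta m\cdot\sigma + \frac12\alpha\beta^2\sigma\cdot q\sigma}}{\sum_\sigma e^{\,\beta m\cdot\sigma + \frac12\alpha\beta^2\sigma\cdot q\sigma}} \qquad (31)$$
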
4.2. Fraction of misaligned spins in RS ansatz 

The fraction defined in (20) can be written as $\phi = \langle\phi_\xi\rangle_\xi$, where the sublattice fractions 
$\phi_\xi$ are expressed in terms of the replicated distributions (30) in the following way: 

$$\phi_\xi = \frac12 - \frac12\sum_\sigma\int\! dh\; P_\xi(\sigma,h)\;\frac{1}{n}\sum_\gamma \sigma_\gamma\,{\rm sgn}[h_\gamma] \qquad (32)$$

Slowly evolving geometry in recurrent neural networks I 







Figure 4. The fraction $\phi_{\rm RS}$ of misaligned spins as a function of temperature, in RS 
ansatz, for integer values of n between 1 and 5, and $\alpha = 0.1$, $\alpha = 0.5$ and $\alpha = 1$ 
respectively. The degree of alignment of spins to their 
local fields increases with n outside the paramagnetic phase. In the paramagnetic 
phase there is no dependence on n. One sees clearly the effect of the first order phase 
transitions, at $\alpha = 0.1$ only for n = 5, and for $\alpha = 0.5$ and $\alpha = 1$ for all n > 2. 



We see that $\phi_1 = \phi_{-1}$. At this stage we make the RS ansatz, putting $m_\alpha = m$ and 
$q_{\alpha\beta} = q + \delta_{\alpha\beta}(1-q)$. We carry out the spin summations over $\sigma_\alpha$ with $\alpha > 1$. A shift in the complex plane 
for the variable z, followed by integration over the remaining Gaussian field variable 
and some simple manipulations, then results in 

$$\phi_{\rm RS} = \frac12 - \frac14\,\frac{\int\! Dz\;{\rm Erf}\Big[\frac{z\sqrt{\alpha q} + \beta\alpha(1-q) + m}{\sqrt{2\alpha(1-q)}}\Big]\, e^{\beta(z\sqrt{\alpha q} + m)}\cosh^{n-1}[\beta(z\sqrt{\alpha q} + m)]}{\int\! Dz\,\cosh^n[\beta(z\sqrt{\alpha q} + m)]}$$
$$\qquad\quad - \frac14\,\frac{\int\! Dz\;{\rm Erf}\Big[\frac{z\sqrt{\alpha q} + \beta\alpha(1-q) - m}{\sqrt{2\alpha(1-q)}}\Big]\, e^{\beta(z\sqrt{\alpha q} - m)}\cosh^{n-1}[\beta(z\sqrt{\alpha q} - m)]}{\int\! Dz\,\cosh^n[\beta(z\sqrt{\alpha q} - m)]} \qquad (33)$$

In the paramagnetic state, where m = q = 0, this simplifies further to 

$$\phi_{\rm RS} = \frac12 - \frac12\,{\rm Erf}\big[\beta\sqrt{\alpha/2}\,\big] \qquad (34)$$

In the recall and spin-glass states the evaluation of (33) requires the (numerical) 
solution of the RS order parameters {m, q} from (14,15). Examples of the resulting 
curves as functions of temperature are shown in figure 4 for $\alpha \in \{0.1, 0.5, 1.0\}$ and 
$n \in \{1,2,3,4,5\}$. We note that for these values of n replica symmetry should be 
stable. The fraction of misaligned spins is seen to decrease with increasing n (i.e. 
with decreasing connectivity temperature), as expected. This effect becomes more 
pronounced for larger $\alpha$, where the amount of frustration to be reduced by the geometry 
dynamics should indeed be largest. In the paramagnetic phase (large T) we see that 
$\phi_{\rm RS}$ is independent of n, in accordance with (34). 
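
The RS fraction of misaligned spins is straightforward to evaluate numerically for given order parameters. The sketch below assumes the form $\phi_{\rm RS} = \frac12 - \frac14[B_+ + B_-]$, where $B_\pm$ is the ratio of $\int Dz\,{\rm Erf}[(z\sqrt{\alpha q}+\beta\alpha(1-q)\pm m)/\sqrt{2\alpha(1-q)}]\,e^{\beta(z\sqrt{\alpha q}\pm m)}\cosh^{n-1}[\beta(z\sqrt{\alpha q}\pm m)]$ to $\int Dz\,\cosh^n[\beta(z\sqrt{\alpha q}\pm m)]$, together with its paramagnetic limit $\frac12-\frac12{\rm Erf}[\beta\sqrt{\alpha/2}]$. These expressions, the function names and the quadrature are assumptions made for illustration; the values of (m, q) have to come from a separate solution of the saddle-point equations.

```python
import math

def gauss_avg(f, zmax=8.0, steps=800):
    """Trapezoid estimate of the Gaussian average  int Dz f(z)."""
    dz = 2.0 * zmax / steps
    total = 0.0
    for k in range(steps + 1):
        z = -zmax + k * dz
        w = 0.5 if k in (0, steps) else 1.0
        total += w * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def phi_rs(m, q, beta, alpha, n):
    """Assumed RS expression for the fraction of misaligned spins (needs q < 1)."""
    s = math.sqrt(alpha * q)
    width = math.sqrt(2.0 * alpha * (1.0 - q))
    shift = beta * alpha * (1.0 - q)

    def branch(sign):
        # sign = +1 and -1 give the two Erf terms, with m -> +m and m -> -m
        field = lambda z: s * z + sign * m
        num = gauss_avg(lambda z: math.erf((field(z) + shift) / width)
                        * math.exp(beta * field(z))
                        * math.cosh(beta * field(z)) ** (n - 1))
        den = gauss_avg(lambda z: math.cosh(beta * field(z)) ** n)
        return num / den

    return 0.5 - 0.25 * (branch(+1) + branch(-1))

def phi_paramagnetic(beta, alpha):
    """Limit m = q = 0 of the expression above; independent of n."""
    return 0.5 - 0.5 * math.erf(beta * math.sqrt(alpha / 2.0))
```

Evaluating `phi_rs(0.0, 0.0, beta, alpha, n)` for several n and comparing with `phi_paramagnetic(beta, alpha)` reproduces the n-independence in the paramagnetic phase.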

4.3. Comparison with numerical simulations 

In order to perform numerical simulations, we need an explicit stochastic dynamical 
equation for updating of the network geometry variables $c = \{c_{ij}\}$, which must approach 
the appropriate Boltzmann equilibrium state characterized by the slow Hamiltonian (2). 
Here we used a Glauber type Markov process, where candidate bonds $c_{ij}$ are drawn 
randomly at each iteration step and then flipped with probability $W[F_{ij}c;\,c]$, where $F_{ij}$ 
denotes the bond switch operator defined by $F_{ij}c_{ij} = 1 - c_{ij}$, $F_{ij}c_{k\ell} = c_{k\ell}$ if $(i,j) \neq (k,\ell)$: 

$$W[F_{ij}c;\,c] = \frac12\Big\{1 - \tanh\Big[\frac12\tilde\beta\big(H_s(F_{ij}c) - H_s(c)\big)\Big]\Big\} \qquad (35)$$

Detailed balance is built in. Upon inserting the slow Hamiltonian and using the 
scaling property $\lim_{N\to\infty} c/N = 0$ of our present extreme dilution regime, one can for 
large N rewrite our transition probabilities as 

$$W[F_{ij}c;\,c] = \frac12\Big\{1 - \tanh\Big[\frac12(2c_{ij} - 1)\Big[\log\Big(\frac{c}{N}\Big) + \frac{\tilde\beta}{c}\sum_{\mu}\xi_i^\mu\xi_j^\mu\langle\sigma_i\sigma_j\rangle\Big]\Big]\Big\} \qquad (36)$$

where, as before, $\langle\ldots\rangle$ indicates an equilibrium average for the fast system, in Boltzmann 
equilibrium with Hamiltonian (1), for a given connectivity matrix c. 
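
A single bond update of such a process can be sketched as follows. The code assumes the large-N flip probability $W = \frac12\{1-\tanh[\frac12(2c_{ij}-1)(\log(c/N) + (\tilde\beta/c)\sum_\mu\xi_i^\mu\xi_j^\mu\langle\sigma_i\sigma_j\rangle)]\}$, which follows from a Glauber rule for a slow Hamiltonian of the form $-\beta^{-1}\log Z_f(c) + \tilde\beta^{-1}\log(N/c)\sum_{i<j}c_{ij}$; the function name and arguments are hypothetical, and the spin correlation $\langle\sigma_i\sigma_j\rangle$ is taken as an input that would have to be measured in a separate equilibration of the fast spins at fixed geometry.

```python
import math

def flip_probability(c_ij, overlap, corr, N, c, beta_tilde):
    """Probability of switching bond (i, j) from c_ij to 1 - c_ij.

    overlap : sum_mu xi_i^mu xi_j^mu  (integer pattern overlap of sites i, j)
    corr    : equilibrium spin correlation <sigma_i sigma_j> of the fast system
    """
    drive = math.log(c / N) + (beta_tilde / c) * overlap * corr
    return 0.5 * (1.0 - math.tanh(0.5 * (2 * c_ij - 1) * drive))
```

In this form the ratio of the creation and removal probabilities of a bond equals $(c/N)\exp[(\tilde\beta/c)\sum_\mu\xi_i^\mu\xi_j^\mu\langle\sigma_i\sigma_j\rangle]$: bonds that are satisfied by the current spin correlations are favoured, while the factor c/N keeps the average connectivity at c.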

In the present type of systems with multiple adiabatically separated time-scales and 
nested equilibrations, the verification of theoretical results by numerical simulations 
is known to be a highly demanding task. Even without the evolving geometry, full 
equilibration of the spins requires relaxation times which diverge with N faster than 
polynomially. If on top of this one aims to also approach geometry equilibrium, which 
involves $\mathcal{O}(N^2)$ stochastic degrees of freedom, the system sizes accessible in practice for 
numerical experimentation are small. Thus profound finite size effects are unavoidable. 
It turns out that, when simulating the process (35), geometry equilibration times are 
indeed extremely long, especially close to phase transitions. This limits our ambitions 
regarding size, with the standard CPU resources at our disposal, to the order of $\sim 10^2$ 
spins. Since in our chosen scaling regime of extreme dilution we have to simultaneously 
minimize $c^{-1}$ and $cN^{-1}$, we have in our numerical experiments chosen c = 

Different macroscopic quantities could in principle be used for testing our theory 
against experiments. The advantage of observables such as m and $\phi$ is that they can 
be measured instantaneously, in contrast to the spin-glass order parameter q. Here we 
have opted for the fraction of misaligned spins $\phi$. The results are shown in figure 5, where we 
observe qualitative agreement between theory and simulations. The deviations observed 
in such experiments are found to decrease with increasing system size N, albeit slowly. 

Figure 5. Comparison between simulation measurements (all with $N \sim 200$) and 
RS theoretical predictions for the fraction of misaligned spins $\phi = N^{-1}\sum_i\theta[-\sigma_i h_i]$ 
(where $h_i$ is the local field at site i), as functions of temperature. The data shown 
refer to $\alpha = 0.5$ with n = 2 (simulations: connected squares; theory: dashed lines) or 
n = 3 (simulations: connected circles; theory: dotted-dashed lines). Due to the need 
to equilibrate two nested disordered processes, conventional computer resources limit 
experimentation to modest values of N. In spite of the resulting finite size effects, the 
graph does show satisfactory qualitative agreement between theory and experiment. 

5. Stability of non-condensed retrieval states 

The significant enlargement of the retrieval phase caused by our geometry adaptation 
(see e.g. figures 1 and 2) could have as a drawback that retrieval of patterns other 
than the condensed one becomes impossible. Here we address the question of whether 
the present 'tailoring' of the geometry variables $\{c_{ij}\}$ to one condensed state will leave 
a finite basin of attraction for the non-condensed patterns, or whether recalling the 
latter requires a rewiring of the system (e.g. by temporarily raising the temperature T) 
to undo the established geometry. For large a most retrieval states must be unstable 
for any given geometry in the extreme dilution scaling regime, since optimal capacity 
calculations a la Gardner for diluted networks predict a finite storage capacity [HH] . 

To answer our question we will study a second (fast) spin system of spins 
$\tau = \{\tau_i\}$, governed again by the fast Hamiltonian (1), with patterns and geometry 
identical to that of the first. In particular, the connectivity statistics are again given by 

$$P(c) = Z_s^{-1}\, e^{-\tilde\beta H_s(c)} \qquad (37)$$

The slow Hamiltonian $H_s(c)$ continues to be defined in terms of the original spins $\sigma$, 
assumed in a condensed state characterized by a finite overlap with the first pattern, 
and will therefore be tailored towards the recall of that particular pattern. By studying 
in the $\tau$ system the properties of states which are condensed in patterns two or higher, 
we gain access to the stability of non-condensed retrieval states in the original $\sigma$ system. 

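The geometry-averaged free energy per spin of our new system is calculated by 
using the replica trick in its conventional form, i.e. via 

$$[f_\tau] = -\lim_{\tilde n\to 0}\lim_{N\to\infty}\frac{1}{\beta\tilde n N}\log\Big\langle\sum_c P(c)\Big[\sum_\tau e^{-\beta H_f(\tau,c)}\Big]^{\tilde n}\Big\rangle \qquad (38)$$

The next stages of analysis are sufficiently similar to those followed earlier to justify 
limiting ourselves to giving the final result in RS approximation. If again we assume at 
most r patterns to be condensed we find: 

$$[f_\tau]^{\rm RS} = {\rm extr}_{\{\hat m_\mu,\,\hat q,\,a\}}\Big\{\frac12\sum_{\mu\le r}\hat m_\mu^2 - \frac14\alpha\beta(\hat q - 1)^2 + \frac12\alpha\beta n a^2 - \frac{1}{\beta}\Big\langle\frac{\int\! Dy\,Dz\,\cosh^n(\Xi_1)\log\cosh(\Xi_2)}{\int\! Dy\,\cosh^n(\Xi_1)}\Big\rangle_\xi\Big\} \qquad (39)$$
$$\Xi_1 = \beta\Big(\sum_{\mu\le r} m_\mu\xi_\mu + y\sqrt{\alpha q}\Big) \qquad (40)$$
$$\Xi_2 = \beta\Big(\sum_{\mu\le r}\hat m_\mu\xi_\mu + a\sqrt{\alpha/q}\,\Big[y + z\sqrt{q\hat q/a^2 - 1}\,\Big]\Big) \qquad (41)$$

In addition to the previously encountered order parameters $\{m_\mu, q\}$, which relate to the $\sigma$ 
system (and continue to be defined as the solution of the earlier saddle-point equations), 
we now have new order parameters $\{\hat m_\mu, \hat q, a\}$, whose physical meaning is found to be 

$$\hat m_\mu = \lim_{N\to\infty}\frac{1}{N}\sum_i\xi_i^\mu\langle\tau_i\rangle, \qquad \hat q = \lim_{N\to\infty}\frac{1}{N}\sum_i\langle\tau_i\rangle^2, \qquad a = \lim_{N\to\infty}\frac{1}{N}\sum_i\langle\sigma_i\tau_i\rangle$$

The new order parameters are to be solved from the saddle-point equations 

$$\hat m_\mu = \Big\langle\xi_\mu\,\frac{\int\! Dy\,Dz\,\tanh(\Xi_2)\cosh^n(\Xi_1)}{\int\! Dy\,\cosh^n(\Xi_1)}\Big\rangle_\xi, \qquad \hat q = \Big\langle\frac{\int\! Dy\,Dz\,\tanh^2(\Xi_2)\cosh^n(\Xi_1)}{\int\! Dy\,\cosh^n(\Xi_1)}\Big\rangle_\xi$$
$$a = \Big\langle\frac{\int\! Dy\,Dz\,\tanh(\Xi_1)\tanh(\Xi_2)\cosh^n(\Xi_1)}{\int\! Dy\,\cosh^n(\Xi_1)}\Big\rangle_\xi \qquad (42)$$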

It can be shown that solutions of these equations will obey < qq (to be expected in 
view of the square root in 52). 
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Figure 6. Cross-sections for fixed n of tlie expanded pliase diagram, in which the 
previous retrieval phase R has been separated into two sub-regions: Ri defines the 
phase where only the nominated pattern can be recalled to which the geometry has 
been taylored, and R2 defines the phase where, in spite of the biased geometry, all 
stored patterns can still be recovered. From left to right: n = 1,2, 3. 



We now adopt a condensed ansatz which corresponds to the r system being in a 
condensed state which differs from that of the cr system (where the latter drives the 
geometry evolution): = m5^i, = m5^2- Solutions of this type must have a = 0, 
which is reasonable considering that any finite correlation between the r and cr systems 
makes rhi = highly improbable. For a = our two systems decouple, with the 
equations for m and q reducing to 

m = J Dz tanh(/3[m -|- z^Ja^]) (43) 
q = Jdz tanh^{P[m + z^/a^]) (44) 



These are exactly the RS equations of the model with frozen random dilution 
The rhi = a = solutions of our saddle point equations could be unstable against 
perturbations in rhi and a. In the paramagnetic phase, an expansion of the free energy 
up to second order in the order parameters gives 

[fr]^^ = extr|^,^,„,} |i(l -f3)J2ml- ^af3f + ^a(3a^ + higher ordersj 

indicating that the physical solution of the saddle-point equations is the one which 
minimizes the free energy with respect to m and a, and maximizes it with respect to 
q, as is usual in the limit n — *• Expansion around rhi = a = 0, with nonzero 7712 
and q, reveals that a second order instability in a occurs at the temperature 



    T_c = \sqrt{\alpha}\,(1-q)\,[1+(n-1)q]    (45)

Below $T_c$ the $\tau$ system will be captured in the $\hat{m}_\mu = \hat{m}\delta_{\mu 1}$ state, with $\hat{m}_1 = m$ (so
retrieval of states other than that to which the geometry has adapted is impossible),
whereas above $T_c$ the $\tau$ system can be in a locally stable condensed state different from
that in which the $\sigma$ system is found. The free energy of the $\hat{m}_\mu = m\delta_{\mu 1}$ state is, however,
always lower than that of other retrieval states, at any temperature. This implies that
in the latter states the $\tau$ system can be at most locally stable. The line (45) has been
calculated in RS ansatz; this seems reasonable, since in the model of RSB does not
occur for $n > 1$. In figure 6 we show for $n \in \{1,2,3\}$ the line (45) which separates
the previous retrieval phase R into two sub-regions, one $R_1$ where only recall of one
single pattern is possible (the one to which the geometry is tailored), and a second
region $R_2$ where, in spite of the biased geometry, multiple patterns can be recalled. As
expected, increasing $n$ (i.e. reducing the connectivity temperature, so the 'tailoring' of
the geometry becomes more effective) reduces the size of the $R_2$ region.

6. Conclusion 

We have studied extremely diluted recurrent neural networks in which the geometry is 
allowed to evolve on time-scales which are adiabatically slower than the equilibration 
time of the (fast) neurons. In contrast to earlier studies, the actual values of the bonds 
remain frozen (they are here given by Hopfield's PP recipe) and only the connectivity 
is dynamic, which implies that the slow adaptation is reversible and will not wipe 
out any stored information. Our motivation was to investigate whether, by having a 
geometry dynamics which aims to reduce frustration, the information retrieval properties 
of the system can be improved. As in earlier models with slow bond dynamics 
[THl [T71 |2ni 1211 1221 UH Uni the equilibrium properties of our model are described by 
a replica theory with nonzero replica dimension $n$, where $n = \tilde{\beta}/\beta$ is the ratio between
the temperature of the (fast) neurons and the temperature of the (slow) connectivity. 

We have calculated phase diagrams, reflecting the stationary state of the slowest 
stochastic system (i.e. the geometry). They reveal a boosting of the retrieval phase, 
compared to the frozen connectivity case, as soon as n > 0. In fact, for nonzero n 
the storage capacity diverges at low temperatures, as long as $p \ll N$. This at first
sight somewhat surprising result is explained by the observation that, in tailoring the
geometry to the recall of a single condensed pattern, the system sacrifices the recall 
quality of an infinite number of non-nominated patterns. RSB effects are as always 
confined to small values of n (below approx. 0.32). In order to measure the expected 
reduction in frustration as a result of the geometry dynamics we have calculated the 
fraction of mis-aligned spins (where spin and local field are of opposite sign). This 
fraction is indeed found to decrease with decreasing temperature of the connectivity. In 
order to examine in which region of the phase diagram retrieval states other than the 
condensed pattern are still locally stable, we studied a pair of identical diluted networks, 
both with the same Boltzmann type connectivity distribution. The connectivity is 
tailored to reduce frustration in only the first of the two copies. This allows one to
study scenarios corresponding to the recall of patterns (in the second copy) which are 
not the one to which the geometry is adapted. Such recall is seen to be possible, but 
only in a sub-region of the recall phase, whose size decreases with increasing n. 
An interesting open question is to what extent the properties of
our model with slowly evolving geometry persist in more (biologically) realistic scenarios,
e.g. when the average number of connections $c$ per neuron remains finite as $N \to \infty$.
Such studies will involve order-parameter functions, see e.g. |H21I31|, and require finite 
n generalizations of finite connectivity replica theory. 